I bet you have seen tons of beautiful (well, not always) map art on the web. If not, I’ve got you covered:

credit: https://www.mapshop.com/

Looks cool, right? However after finding more and more sites offering similarly looking maps I was not inclined to leave my money with any of them. This gave me a thought that despite those maps may bear low artistic value they still could be a great source of learning. So equipped with basic R skills, some experience with vector graphics and faint memories of what longitude and latitude are I approached this project.

##Where should I start

After a short research I figured out that there are plenty of R libraries capable of pulling, processing and plotting GIS data, so not to get lost completely (try not to yourself) I decided to develop a set of requirements for the final output to narrow a choice for me.

Here are my expectations - High Resolution, print-ready, minimalist map of my hometown (Volgograd, Russia).

Now translation:

High Resolution - if I learned anything working for 2 years in Printing services is that to get a great quality in print you don’t want to mess with any raster, you need a vector image, this way you will be able to modify and scale it to any size without quality loss.

Print ready - map should be fully completed in R without use of other graphic editing software like Photoshop. This is more of a self-inclined limitation to see how far R could take me.

Volgograd is far from US, meaning US Census GIS data won’t help me much.

Minimalist Map - should not include too many detail, I will shoot to have 3 layers / features like roads, parks and water.

That way I ended up with 4 libraries:

library(osmdata) # wrapper for Overpass API from Open Street Maps
library(sf) # library for manipulating Simple features objects
library(dplyr) 
library(ggplot2)

DISCLAIMER: Initially I approached this project with very low knowledge of GIS and OSM in particular so I made many mistakes and poor design choices along the way. I explored new packages, explored the geodetic theory, explored limits of my patience and determination :) so treat this notebook as explorers’ logbook. If you’d like to follow along and build a map for yourself, refer to streamlined version of my workflow published on GitHub Yes it is streamlined as (!SPOILER) I ended up building 6 maps for myself.

Pull the data

Let’s look at opq() function - it wraps all query parameters into a single API call.

vog <- opq("Volgograd")
head(vog)
## $bbox
## [1] "48.4070531,44.1087686,48.8890717,44.6874279"
## 
## $prefix
## [1] "[out:xml][timeout:25];\n(\n"
## 
## $suffix
## [1] ");\n(._;>;);\nout body;"
## 
## $features
## NULL
## 
## $osm_types
## [1] "node"     "way"      "relation"

$bbox - opq() had transformed toponym “Volgograd” into bounding box - matrix of latitude and longitude coordinates used to limit the query to a specific area. In above example bounding box is defined by city’s administrative boundary. Best way to illustrate is to use OSM export interface

vog$bbox
## [1] "48.4070531,44.1087686,48.8890717,44.6874279"

$prefix and $suffix - provide output format and resource parameters as well as all required syntax to form a valid Overpass QL statement.

$osm_types - define the elements query need to pull (see more here)

I know, it”s tempting to query and pull every little detail OSM has to offer, something like:

vog_all_data <- osmdata_sf(vog)

but such query would take too much time and resources to execute, without saying that resulting dataset would be too big and would eat up most of your RAM.

Hence $features - help to narrow the query by passing it specific map features. Feature can be specified on a high-level as a single key (e.g. “amenity” or “building”) or be more specific as a key-value pair (e.g. “amenity”=“healthcare” or “building”=“commercial”). For my minimalist map I would pull all water shapes, roads and possibly parks or other vegetation. I don’t want to map full area of the city but instead will zoom-in closer to the downtown. So here”s updated bbox:

min_lon <- 44.3587
max_lon <- 44.7054
min_lat <- 48.6103
max_lat <- 48.8462

bbox <- c(min_lon,min_lat,max_lon,max_lat)
vog <- opq(bbox = bbox)

and now lets pour some water:

vog_water <- vog %>% 
  add_osm_feature(key = "natural",
                  value = "water") %>% 
  osmdata_sf()

the last line instructs query to treat output as simple features objects

vog_water
## Object of class 'osmdata' with:
##                  $bbox : 48.6103,44.3587,48.8462,44.7054
##         $overpass_call : The call submitted to the overpass API
##                  $meta : metadata including timestamp and version numbers
##            $osm_points : 'sf' Simple Features Collection with 25942 points
##             $osm_lines : 'sf' Simple Features Collection with 52 linestrings
##          $osm_polygons : 'sf' Simple Features Collection with 320 polygons
##        $osm_multilines : NULL
##     $osm_multipolygons : 'sf' Simple Features Collection with 16 multipolygons

As you can see apart from some metadata about the query, the resulting object has multiple dataframes with geometries: points, lines and polygons (the best explanation I could find on what’s the difference between a line and multiline or polygon and multipolygon is here).

Let’s plot all those shapes to see difference.

ggplot() +
  geom_sf(data = vog_water$osm_multipolygons,
          color = NA,
          fill = "steelblue")+
  labs(title = "Multipolygons")+
  coord_sf()+
  theme_void()

ggplot() +
  geom_sf(data = vog_water$osm_polygons,
          color = NA,
          fill = "steelblue")+
  labs(title = "Polygons")+
  theme_void()

ggplot() +
  geom_sf(data = vog_water$osm_lines,
          color = "steelblue")+
  labs(title = "Lines")+
  theme_void()

ggplot() +
  geom_sf(data = vog_water$osm_points,
          color = "steelblue")+
  labs(title = "Points")+
  theme_void()

Work with geometries

Points and lines are less relevant for this type of object and it seems that multipolygons layer is exactly what I need. But it appears that polygons may actually have some extra water shapes that are missing on multipolygon layer. (here’s the probable reason). Let’s compare the two dataframes.

#take only non-intersecting shapes from polygon layer
poly_diff <- st_difference(vog_water$osm_polygons,vog_water$osm_multipolygons)

ggplot() +
   geom_sf(data = poly_diff,
          fill = "violet",
          color = NA)+
  geom_sf(data = vog_water$osm_multipolygons,
          fill = "steelblue",
          color = NA)+
  theme_void()

As you can see there are plenty of water shapes missing from multipolygon layer, but it seems that polygon dataframe besides water also includes islands and islets.

Let’s take a peek at polygon dataframe once more,

unique(vog_water$osm_polygons$natural)
## [1] NA      "water" "sand"

Looks like field $natural identifies whether polygon is water or something else, and it appears that there are not only water shapes. Let’s plot it for vizibility:

ggplot() +
  geom_sf(data = vog_water$osm_polygons,
          aes(fill = natural),
          color = NA)+
  theme_void()

And here’s the “A-Ha” moment. Let’s filter out anything that is not water and merge it again with the multipolygon layer:

vog_water$osm_polygons <- vog_water$osm_polygons %>% 
  filter(natural == "water")

all_vog_water <- st_union(vog_water$osm_polygons,vog_water$osm_multipolygons)

ggplot()+
  geom_sf(data = all_vog_water,
          fill = "steelblue",
          color = NA)+
  theme_void()

Success!

Pull little more data

Let’s move on to the next feature and load the data for roads:

vog_roads <- vog %>% 
  add_osm_feature(key = "highway") %>%
  osmdata_sf() 

ggplot() +
  geom_sf(data = vog_roads$osm_lines,
          aes(color=highway))+
  theme_void()

What a beautiful mess. Let’s clean it up and group it into 3 buckets: pedestrian, minor and main roads.

pedestrian <- vog_roads$osm_lines %>% 
  filter(highway %in% c("footway","track","path"))

minor_road <- vog_roads$osm_lines %>% 
  filter(highway %in% c("residential","tertiary", "secondary", "tertiary_link","secondary_link","primary_link","trunk_link"))

main_roads <- vog_roads$osm_lines %>% 
  filter(highway %in% c("primary","trunk"))

ggplot()+
  geom_sf(data = pedestrian,
          linewidth = .1,
          alpha = .2)+
  geom_sf(data = minor_road,
          linewidth = .5,
          alpha = .7)+
  geom_sf(data = main_roads,
          linewidth = 1)+
  theme_void()

Looks good. Onto the next one. Landscape around Volgograd is mostly prairies, however there are some woods, including man-made, around the city. So I got curious to see how many:

vog_woods <- vog %>% 
  add_osm_feature(key = "natural",
                  value = "wood") %>%
  osmdata_sf() 

vog_woods <- st_union(vog_woods$osm_polygons,vog_woods$osm_multipolygons)

ggplot() +
  geom_sf(data = vog_woods,
          fill = "darkgreen",
          color = NA)+
  theme_void()

Volgograd has suffered a lot during the Second World War and hence there are hundreds of memorials, monuments and ruins in the city with the most famous being a giant statue - ‘the Motherland Calls’ War memorial - The Motherland Calls Let’s see what OSM has to offer on that.

vog_historic <- vog %>% 
  add_osm_feature(key = "historic") %>%
  osmdata_sf() 

Finally we have all four layers, Let’s put them together, crop and apply some fancy colors.

background_fill <-"#E0D4BE"
woods_fill <-"#78B8A0"
key_color <- "#3B3B3B"
water_fill <- "#B4D9C8"
historic_fill <- "#EB0062"

vog_map <- ggplot()+
  
  #water
  geom_sf(data = all_vog_water,
          fill = water_fill,
          color = NA,
          size = .1)+
  
  #woods
  geom_sf(data = vog_woods,
          fill = woods_fill,
          color = NA)+
  
  #historic
  geom_sf(data = vog_historic$osm_polygons,
          fill = historic_fill,
          color = NA)+
  
  #roads
  geom_sf(data = pedestrian,
          color= key_color,
          size = .1,
          alpha = .2)+
  geom_sf(data = minor_road,
          color = key_color,
          size = .4,
          alpha = .3)+
  geom_sf(data = main_roads,
          color= key_color,
          size = .6,
          alpha = .6)+
  
  #crop the map to the bbox
  coord_sf(xlim = c(min_lon,max_lon),
           ylim = c(min_lat,max_lat),
         expand = FALSE)+
 
  theme_void()+
  
  #add background fill
  theme(panel.background = element_rect(fill = background_fill,
                                       color = key_color))
vog_map

I love how it turned out, however it’s not over yet. In the beginning I’ve used OSM Export interface to find min/max coordinates for my bounding box, however I wasn’t accounting for any aspect ratio or specific size but since I need to prepare file for print I need to consider what frame / paper size would be used for this wall-art. I’ve decided to put my print into 5 by 5 inch frame, I need it to be perfectly square (or maybe a circle?). That is a 2 minute job for me in Photoshop, but the intent was to learn how to do it the hard way (and as it appears the right way).

In fact the principle is similar to graphic editor masking, you need to draw a shape which then will be used as a mask for the map layer. Easiest way to get a perfectly square bounding box is to wrap it around a circle. And to draw a circle you just need a midpoint and radius.

This supposed to be a Chapter 1

To begin with we need a center coordinates. I’ve chosen Memorial Eternal Flame in downtown Volgograd as the midpoint for the mask and radius of 10 kilometers (hope my “imperial” friends will forgive me this heresy).

lat <- 48.70842 
lon <- 44.51527
r <- 10000

midpoint <- st_point(c(lon,lat)) %>% 
  st_sfc() %>% 
  st_set_crs(4326) 

I found tons of tutorials on how to crop a map to a circle, but none took time to explain what exactly was happennig. So here’s what I managed to find out

Function st_point constructs a geometry object (point) from the given list of numbers. While st_set_crs instructs that provided coordinates are mapped to EPSG:4326 coordinate reference system - CRS.

Bloody the hell, does it even mean anything?

Apparently yes, however to make sure I finish this notebook before I turn 60 I will not explain it all here. Kudos to Justyna Jurkowska and 8th Light for the most humane summary of all this Geodetic mess. So if you are really interested to understand it all - read the above article or watch this video in case you just want to have fun. regardles of your choice, here are few things you need to know:

How does above relates to our circle? OSMs data model uses ESPG:4326 which is a spherical model and therefore the only area where distortions will be minimal are along the equator, but as object is getting closer to the poles the more distorted it becomes. If we just keep CRS as EPSG:4326 and plot the circle to our map it will appear distorted since Volgograd is located almost half-way between the equator and the North Pole. To illustrate that let’s draw a circle from our midpoint using both CRSs:

midpoint_4326 <- midpoint
midpoint_2707 <- midpoint %>% st_transform(crs=2707)
midpoint_4326
## Geometry set for 1 feature 
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 44.51527 ymin: 48.70842 xmax: 44.51527 ymax: 48.70842
## Geodetic CRS:  WGS 84
## POINT (44.51527 48.70842)
midpoint_2707
## Geometry set for 1 feature 
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 464432.1 ymin: 5397405 xmax: 464432.1 ymax: 5397405
## Projected CRS: Pulkovo 1995 / 3-degree Gauss-Kruger CM 45E
## POINT (464432.1 5397405)

Notice how coordinates transformed from lat/lon into meters. Functionst_buffer creates a circle geometry from the set point with radius equal to r

circle_mask_distorted <-st_buffer(midpoint_4326, dist = r)
circle_mask <- st_buffer(midpoint_2707, dist = r) %>% st_transform(crs=4326)
vog_map+
  geom_sf(data = midpoint,
          color = 'red')+
  geom_sf(data = circle_mask,
          color = 'blue',
          fill = NA)+
  geom_sf(data = circle_mask_distorted,
          color = 'red',
          fill = NA)+
 # geom_sf(data = box_mask,
 #         color = key_color,
 #         fill = NA)+
  coord_sf(xlim = c(min_lon,max_lon),
           ylim = c(min_lat,max_lat),
         expand = FALSE)

So Red Circle is the buffer we created directly in 4326 CSR and blue one was transformed into 2707 and then projected back in 4326.

Technically it means that all the shapes I’ve plotted so far are also somewhat distorted, because they all mapped relative to EPSG:4326 model (which is not a big deal really, since this map will never be used for navigation), however I still decided to change all geometries to 2707, just to see how it works.

st_transforms function allows to move SF objects to another CRS and st_intersection function returns a geometry that represents the intersection of 2 input geometries.

circle_mask <- st_buffer(midpoint_2707, dist = r)
crs <- 2707

water_masked <- all_vog_water %>% 
  st_transform(crs = crs) %>% 
  st_intersection(circle_mask)
historic_masked <- vog_historic$osm_polygons %>% 
  st_transform(crs = crs) %>% 
  st_intersection(circle_mask)
woods_masked <- vog_woods %>%
  st_transform(crs = crs) %>%
  st_intersection(circle_mask)
pedestrian_masked <- pedestrian %>%
  st_transform(crs = crs) %>%
  st_intersection(circle_mask)
minor_road_masked <- minor_road %>%
  st_transform(crs = crs) %>%
  st_intersection(circle_mask)
main_roads_masked <- main_roads %>%
  st_transform(crs = crs) %>%
  st_intersection(circle_mask)

All layers are masked and transformed so now we map them again:

vog_map_circle <- ggplot()+
  
  geom_sf(data = circle_mask,
          fill = background_fill,
          color = NA)+
 
  #water
  geom_sf(data = water_masked,
          fill = water_fill,
          color = NA
          )+
  
  #woods
  geom_sf(data = woods_masked,
          fill = woods_fill,
          color = NA)+
  

  #historic
  geom_sf(data = historic_masked,
          fill = historic_fill,
          color = NA)+
  
  #roads
  geom_sf(data = pedestrian_masked,
          color= key_color,
          linewidth = .4,
          alpha = .3,
          )+
  
  geom_sf(data = minor_road_masked,
          color = key_color,
          linewidth = .5,
          alpha = .8,
          )+
  
  geom_sf(data = main_roads_masked,
          color= key_color,
          linewidth = 1,
          )+
   
  theme_void()

vog_map_circle

Last piece is to add some text caption, so anyone can see what’s that place on a map.

Add title

R is infamous for fonts support, so I had to load 2 more libraries to access Google fonts and bring some beauty into my captions.

library(sysfonts)
library(showtext)
all_fonts <- tibble(font_families_google())
font_add_google("Poiret One", "poiret")  
showtext_auto()
 vog_map_circle+
   
     geom_sf(data = circle_mask,
          fill = NA,
          color = key_color,
          linewidth = 0.8)+
   
   labs(caption = "ВОЛГОГРАД")+

   
    theme(plot.caption = element_text(family = "poiret", 
                                    color = key_color,
                                    size=24,
                                    hjust=0.5),
          plot.margin = unit(c(0.5,0.5,0.5,0.5),"in"))

Notice that font size above was set only to render it correctly in this notebook, but once you will be ready to save your artwork be sure to adjust font size to the resolution you use. Default screen resolution is 72 dots (points) per inch, meaning is you set your font size to 36 it will take half of an inch on your screen. To get a sharp image in print you need a much higher resolution (300 dpi is the standard for print products), so to render your correctly you need to adjust it’s size to a roughly 4 times larger.

Final step is to save the image:

ggsave("volgograd.png", plot = last_plot(),
       scale = 1, width = 1500, height = 1500, units = "px",
       dpi = 300)

And we are done!

Last minute I decided I wasn’t happy with the colors so the end result on my wal looks much different, but it actually inspired me to make a map for all other cities I lived in.